Applying Decay Strategies to Branch Predictors for Leakage Energy Savings

Authors

  • Zhigang Hu
  • Philo Juang
  • Kevin Skadron
  • Douglas W. Clark
  • Margaret Martonosi
Abstract

This paper shows that substantial reductions in leakage energy can be obtained by deactivating groups of branch-predictor entries if they lie idle for a sufficiently long time. Decay techniques, first introduced by Kaxiras et al. for caches, work by tracking accesses to cache lines and turning off power to those that lie idle for a sufficiently long period of time (the decay interval). Once deactivated, these lines essentially draw no leakage current. The key trick is in identifying opportunities where an item can be turned off without incurring significant performance or power cost. Branch predictors are, like caches, large array structures with significant leakage; as such, it is natural to consider applying decay techniques to them as well. Applying decay techniques to branch predictors is, however, not straightforward. The overhead for applying decay to individual counters in the predictor is prohibitive, so decay must be applied to groups of predictor entries. The most natural grouping is a row in the square data array used to implement the branch predictor, but then decay will only be successful if entire rows lie idle for sufficiently long periods of time. This paper shows that branch predictors do exhibit sufficient spatial and temporal locality to make decay effective for bimodal, gshare, and hybrid predictors, as well as the branch target buffer. In particular, decay is quite effective when applied intelligently to hybrid predictors, which use two predictors in parallel and are among the most accurate predictor organizations. Hybrid predictors are also especially amenable to decay, because inactive entries in one component can be left inactive if the other component is able to provide a prediction. Overall, this paper demonstrates that decay techniques apply more broadly than just to caches, but that careful policy and implementation make the difference between success and failure in building decay-based branch predictors.
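The row-level decay idea described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the class name, the table geometry, the decay interval, the reset-to-weakly-taken policy on wake-up, and the use of per-row timestamps in place of hardware idle counters are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DecayingBimodalPredictor:
    """Bimodal predictor whose 2-bit counters are powered off one row at a time.

    All parameters are illustrative assumptions, not values from the paper.
    """
    num_rows: int = 64
    counters_per_row: int = 64
    decay_interval: int = 4096  # idle cycles before a row is switched off
    rows: list = field(init=False)
    active: list = field(init=False)
    last_access: list = field(init=False)

    def __post_init__(self):
        # 2-bit saturating counters, initialised to "weakly taken" (2).
        self.rows = [[2] * self.counters_per_row for _ in range(self.num_rows)]
        self.active = [True] * self.num_rows
        # Timestamps are a simulation shortcut for per-row idle tracking.
        self.last_access = [0] * self.num_rows

    def _locate(self, pc):
        # Map a branch address to (row, column) in the counter array.
        index = pc % (self.num_rows * self.counters_per_row)
        return divmod(index, self.counters_per_row)

    def tick(self, cycle):
        """Deactivate each active row idle for at least decay_interval cycles."""
        for r in range(self.num_rows):
            if self.active[r] and cycle - self.last_access[r] >= self.decay_interval:
                self.active[r] = False

    def access(self, pc, taken, cycle):
        """Predict the branch at pc, then train on the outcome."""
        row, col = self._locate(pc)
        if not self.active[row]:
            # A decayed row has lost its state: wake it with default counters.
            self.rows[row] = [2] * self.counters_per_row
            self.active[row] = True
        self.last_access[row] = cycle
        counter = self.rows[row][col]
        prediction = counter >= 2
        # Saturating update toward the actual outcome.
        self.rows[row][col] = min(3, counter + 1) if taken else max(0, counter - 1)
        return prediction

    def active_fraction(self):
        """Fraction of rows currently powered, a rough proxy for leakage."""
        return sum(self.active) / self.num_rows
```

In this sketch a row is switched off only when every counter in it has gone unreferenced for the whole interval, which is why the abstract stresses that whole rows must lie idle. Waking a row discards whatever its counters had learned, so the choice of decay interval trades leakage savings against lost prediction state.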

Similar resources

Managing Leakage for Transient Data: Decay and Quasi-Static Memory Cells

Much of on-chip storage is devoted to transient, often short-lived, data. Despite this, virtually all on-chip array structures use six-transistor (6T) static RAM cells that store data indefinitely. In this paper we propose the use of quasi-static four-transistor (4T) RAM cells. Quasi-static 4T cells provide both energy and area savings. These cells have no connection to Vdd thus inherently provi...

Implementing Decay Techniques using Quasi-Static Memory Cells

This paper proposes the use of four-transistor (4T) cache and branch predictor array cell designs to address increasing worries regarding leakage power dissipation. While 4T designs lose state when infrequently accessed, they have very low leakage, smaller area, and no capacitive loads to switch. This short paper gives an overview of 4T implementation issues and a preliminary evaluation of leak...

Ultra Low Power Cooperative Branch Prediction

Branch Prediction is a key task in the operation of a high performance processor. An inaccurate branch predictor results in increased program run-time and a rise in energy consumption. The drive towards processors with limited die-space and tighter energy requirements will continue to intensify over the coming years, as will the shift towards increasingly multicore processors. Both trends make ...

Using Branch Prediction Information for Near-Optimal I-Cache Leakage

This paper describes a new on-demand wakeup prediction policy for instruction cache leakage control that achieves better leakage savings than prior policies while avoiding their performance overheads. The proposed policy reduces leakage energy by more than 92% with less than 0.3% performance overhead on average. The key to this new on-demand policy is to use branch prediction ...

Exploring branch target buffer access filtering for low-energy and high-performance microarchitectures

Powerful branch predictors along with a large branch target buffer (BTB) are employed in superscalar and simultaneous multi-threading (SMT) processors for instruction-level parallelism and thread-level parallelism exploitation. However, the large BTB not only dominates the predictor energy consumption, but also becomes a major roadblock in achieving faster clock frequencies at deep sub-micron t...

Publication date: 2002